- 
            Large language models (LLMs) are notoriously memory-intensive during training, particularly with the popular AdamW optimizer. This memory burden forces practitioners to use more or higher-end GPUs or to reduce batch sizes, which limits training scalability and throughput. To address this, various memory-efficient optimizers have been proposed to reduce optimizer memory usage. However, they face critical challenges: (i) reliance on costly SVD operations; (ii) significant performance trade-offs compared to AdamW; and (iii) optimizer memory overhead that remains substantial if competitive performance is to be maintained. In this work, we identify that AdamW's learning rate adaptation rule can be effectively coarsened as a structured learning rate update. Based on this insight, we propose Approximated Gradient Scaling for Memory-Efficient LLM Optimization (APOLLO), which approximates learning rate scaling using an auxiliary low-rank optimizer state based on pure random projection. This structured learning rate update rule makes APOLLO highly tolerant to further memory reductions while delivering comparable pre-training performance. Even its rank-1 variant, APOLLO-Mini, achieves superior pre-training performance compared to AdamW with SGD-level memory costs. Extensive experiments demonstrate that the APOLLO series performs on par with or better than AdamW, while achieving greater memory savings by nearly eliminating the optimization states of AdamW. These savings provide significant system-level benefits: (1) Enhanced Throughput: 3x throughput on an 8xA100-80GB setup compared to AdamW by supporting 4x larger batch sizes. (2) Improved Model Scalability: Pre-training LLaMA-13B with naive DDP on A100-80GB GPUs without system-level optimizations. (3) Low-End GPU Friendly Pre-training: Pre-training LLaMA-7B on a single GPU using less than 12 GB of memory with weight quantization.
            Free, publicly-accessible full text available February 17, 2026
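The idea in the abstract, keeping Adam-style moments only in a randomly projected low-rank space and using them to rescale the raw gradient, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the Gaussian projection scaled by 1/sqrt(rank), the per-column norm ratio used as the scaling factor, and all hyperparameter defaults are assumptions made for illustration, and weight decay is omitted.

```python
import numpy as np


def structured_scaling_step(W, G, state, lr=1e-3, rank=4,
                            betas=(0.9, 0.999), eps=1e-8, seed=0):
    """One hypothetical update step for a weight matrix W (m x n) with gradient G.

    Adam-style first and second moments are stored only for the projected
    gradient R = P @ G of shape (rank, n), never for the full (m, n) gradient.
    """
    m, n = G.shape
    if not state:
        # Assumed setup: a fixed Gaussian random projection, no SVD involved.
        rng = np.random.default_rng(seed)
        state["P"] = rng.standard_normal((rank, m)) / np.sqrt(rank)
        state["M"] = np.zeros((rank, n))
        state["V"] = np.zeros((rank, n))
        state["t"] = 0

    state["t"] += 1
    t = state["t"]
    b1, b2 = betas

    # Project the gradient and update the moments in the low-rank space.
    R = state["P"] @ G
    state["M"] = b1 * state["M"] + (1 - b1) * R
    state["V"] = b2 * state["V"] + (1 - b2) * R ** 2

    # Bias-corrected Adam direction, still in the low-rank space.
    M_hat = state["M"] / (1 - b1 ** t)
    V_hat = state["V"] / (1 - b2 ** t)
    R_tilde = M_hat / (np.sqrt(V_hat) + eps)

    # Assumed structured rule: one scaling factor per column of G, taken as
    # the ratio between the adapted and the raw projected column norms.
    scale = np.linalg.norm(R_tilde, axis=0) / (np.linalg.norm(R, axis=0) + eps)

    # Apply the coarse scaling to the full-rank raw gradient.
    return W - lr * G * scale


def optimizer_state_floats(m, n, rank=None):
    """Count stored optimizer floats for one (m x n) weight matrix.

    rank=None models dense first and second moments (2 * m * n); otherwise
    it counts this sketch's low-rank moments plus its stored projection.
    """
    if rank is None:
        return 2 * m * n
    return 2 * rank * n + rank * m
```

Under these assumptions, `optimizer_state_floats(4096, 4096)` gives 33,554,432 floats for dense moments, while `optimizer_state_floats(4096, 4096, rank=1)` gives 12,288, which shows why the state shrinks toward SGD-level memory as the rank drops. The sketch stores the projection matrix for simplicity; regenerating it from the seed would reduce the count further.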
- 
            Cheng, H.-P.; Zhang, T.; Zhang, Y.; Li, S.; Liang, F.; Yan, F.; Li, M.; Chandra, V.; Li, H.; Chen, Y. (AAAI Conference on Artificial Intelligence (AAAI 2021))
- 
            Cheng, H.-P.; Zhang, T.; Zhang, Y.; Li, S.; Liang, F.; Yan, F.; Li, M.; Chandra, V.; Li, H.; Chen, Y. (The Thirty-Fifth AAAI Conference on Artificial Intelligence)
- 
            Al Aamery, Nabil; Fox, James F.; Snyder, Mark; Chandramouli, Chandra V. (Journal of Hydrology)
 An official website of the United States government